Overview

Dataset statistics

Number of variables: 13
Number of observations: 1599
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 240
Duplicate rows (%): 15.0%
Total size in memory: 162.5 KiB
Average record size in memory: 104.1 B

Variable types

NUM: 12
BOOL: 1

Warnings

Duplicates: dataset has 240 (15.0%) duplicate rows
Zeros: citric acid has 132 (8.3%) zeros
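
Both warnings reduce to simple counts. The sketch below shows one way such figures could be derived, assuming a duplicate is any row that exactly repeats an earlier row and that percentages are taken over all 1599 observations. The helper names and the toy rows are hypothetical; they are not part of the report or of pandas-profiling.

```python
from collections import Counter


def count_duplicate_rows(rows):
    """Count rows that exactly repeat an earlier row (first occurrences are not counted)."""
    counts = Counter(tuple(row) for row in rows)
    return sum(n - 1 for n in counts.values())


def count_zeros(values):
    """Count the entries of one column that are equal to zero."""
    return sum(1 for v in values if v == 0)


def percent(part, total):
    """Share of `part` in `total`, as a percentage rounded to one decimal."""
    return round(100.0 * part / total, 1)


# Hypothetical toy data: (citric acid, quality) pairs, not taken from the report.
toy_rows = [(0.0, 5), (0.0, 5), (0.04, 5), (0.56, 6), (0.0, 5)]
toy_duplicates = count_duplicate_rows(toy_rows)        # two repeats of (0.0, 5)
toy_zeros = count_zeros([row[0] for row in toy_rows])  # three zero entries

# The same percentage arithmetic applied to the counts reported in the warnings.
duplicate_pct = percent(240, 1599)
zeros_pct = percent(132, 1599)
```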

Reproduction

Analysis started: 2020-09-23 16:28:47.730961
Analysis finished: 2020-09-23 16:29:33.332550
Duration: 45.6 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

fixed acidity
Real number (ℝ≥0)

Distinct: 96
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.319637273
Minimum: 4.6
Maximum: 15.9
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.5 KiB

Quantile statistics

Minimum: 4.6
5th percentile: 6.1
Q1: 7.1
Median: 7.9
Q3: 9.2
95th percentile: 11.8
Maximum: 15.9
Range: 11.3
Interquartile range (IQR): 2.1

Descriptive statistics

Standard deviation: 1.741096318
Coefficient of variation (CV): 0.2092755082
Kurtosis: 1.132143398
Mean: 8.319637273
Median Absolute Deviation (MAD): 1
Skewness: 0.9827514413
Sum: 13303.1
Variance: 3.031416389
Monotonicity: Not monotonic
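
Several of the figures reported for fixed acidity are tied together by simple identities: range = maximum - minimum, IQR = Q3 - Q1, mean = sum / number of observations, CV = standard deviation / mean, and variance = standard deviation squared. The sketch below is only a consistency check of those identities against the reported values; the variable names are illustrative.

```python
import math

# Values copied from the fixed acidity statistics in this report.
n_observations = 1599
minimum, maximum = 4.6, 15.9
q1, q3 = 7.1, 9.2
total = 13303.1
mean = 8.319637273
std = 1.741096318

derived = {
    "Range": maximum - minimum,
    "Interquartile range (IQR)": q3 - q1,
    "Mean": total / n_observations,
    "Coefficient of variation (CV)": std / mean,
    "Variance": std ** 2,
}
reported = {
    "Range": 11.3,
    "Interquartile range (IQR)": 2.1,
    "Mean": 8.319637273,
    "Coefficient of variation (CV)": 0.2092755082,
    "Variance": 3.031416389,
}

# Compare with a small tolerance because the report rounds its figures.
consistent = {
    name: math.isclose(derived[name], reported[name], abs_tol=1e-6)
    for name in reported
}
print(consistent)
```

The same check can be repeated for any other numeric variable by swapping in its reported values.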
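Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
7.2 | 67 | 4.2%
7.1 | 57 | 3.6%
7.8 | 53 | 3.3%
7.5 | 52 | 3.3%
7 | 50 | 3.1%
7.7 | 49 | 3.1%
6.8 | 46 | 2.9%
7.6 | 46 | 2.9%
8.2 | 45 | 2.8%
7.4 | 44 | 2.8%
Other values (86) | 1090 | 68.2%

Minimum 5 values

Value | Count | Frequency (%)
4.6 | 1 | 0.1%
4.7 | 1 | 0.1%
4.9 | 1 | 0.1%
5 | 6 | 0.4%
5.1 | 4 | 0.3%

Maximum 5 values

Value | Count | Frequency (%)
15.9 | 1 | 0.1%
15.6 | 2 | 0.1%
15.5 | 2 | 0.1%
15 | 2 | 0.1%
14.3 | 1 | 0.1%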

volatile acidity
Real number (ℝ≥0)

Distinct: 143
Distinct (%): 8.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5278205128
Minimum: 0.12
Maximum: 1.58
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.5 KiB

Quantile statistics

Minimum: 0.12
5th percentile: 0.27
Q1: 0.39
Median: 0.52
Q3: 0.64
95th percentile: 0.84
Maximum: 1.58
Range: 1.46
Interquartile range (IQR): 0.25

Descriptive statistics

Standard deviation: 0.1790597042
Coefficient of variation (CV): 0.3392435493
Kurtosis: 1.22554225
Mean: 0.5278205128
Median Absolute Deviation (MAD): 0.12
Skewness: 0.6715925724
Sum: 843.985
Variance: 0.03206237765
Monotonicity: Not monotonic
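Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0.6 | 47 | 2.9%
0.5 | 46 | 2.9%
0.43 | 43 | 2.7%
0.59 | 39 | 2.4%
0.36 | 38 | 2.4%
0.58 | 38 | 2.4%
0.4 | 37 | 2.3%
0.49 | 35 | 2.2%
0.38 | 35 | 2.2%
0.39 | 35 | 2.2%
Other values (133) | 1206 | 75.4%

Minimum 5 values

Value | Count | Frequency (%)
0.12 | 3 | 0.2%
0.16 | 2 | 0.1%
0.18 | 10 | 0.6%
0.19 | 2 | 0.1%
0.2 | 3 | 0.2%

Maximum 5 values

Value | Count | Frequency (%)
1.58 | 1 | 0.1%
1.33 | 2 | 0.1%
1.24 | 1 | 0.1%
1.185 | 1 | 0.1%
1.18 | 1 | 0.1%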

citric acid
Real number (ℝ≥0)

ZEROS

Distinct: 80
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2709756098
Minimum: 0
Maximum: 1
Zeros: 132
Zeros (%): 8.3%
Memory size: 12.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0.09
Median: 0.26
Q3: 0.42
95th percentile: 0.6
Maximum: 1
Range: 1
Interquartile range (IQR): 0.33

Descriptive statistics

Standard deviation: 0.1948011374
Coefficient of variation (CV): 0.7188880858
Kurtosis: -0.7889975154
Mean: 0.2709756098
Median Absolute Deviation (MAD): 0.17
Skewness: 0.3183372953
Sum: 433.29
Variance: 0.03794748313
Monotonicity: Not monotonic
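Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 132 | 8.3%
0.49 | 68 | 4.3%
0.24 | 51 | 3.2%
0.02 | 50 | 3.1%
0.26 | 38 | 2.4%
0.1 | 35 | 2.2%
0.08 | 33 | 2.1%
0.01 | 33 | 2.1%
0.21 | 33 | 2.1%
0.32 | 32 | 2.0%
Other values (70) | 1094 | 68.4%

Minimum 5 values

Value | Count | Frequency (%)
0 | 132 | 8.3%
0.01 | 33 | 2.1%
0.02 | 50 | 3.1%
0.03 | 30 | 1.9%
0.04 | 29 | 1.8%

Maximum 5 values

Value | Count | Frequency (%)
1 | 1 | 0.1%
0.79 | 1 | 0.1%
0.78 | 1 | 0.1%
0.76 | 3 | 0.2%
0.75 | 1 | 0.1%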

residual sugar
Real number (ℝ≥0)

Distinct: 91
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.538805503
Minimum: 0.9
Maximum: 15.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.5 KiB

Quantile statistics

Minimum: 0.9
5th percentile: 1.59
Q1: 1.9
Median: 2.2
Q3: 2.6
95th percentile: 5.1
Maximum: 15.5
Range: 14.6
Interquartile range (IQR): 0.7

Descriptive statistics

Standard deviation: 1.40992806
Coefficient of variation (CV): 0.5553509545
Kurtosis: 28.61759542
Mean: 2.538805503
Median Absolute Deviation (MAD): 0.3
Skewness: 4.540655426
Sum: 4059.55
Variance: 1.987897133
Monotonicity: Not monotonic
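Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
2 | 156 | 9.8%
2.2 | 131 | 8.2%
1.8 | 129 | 8.1%
2.1 | 128 | 8.0%
1.9 | 117 | 7.3%
2.3 | 109 | 6.8%
2.4 | 86 | 5.4%
2.5 | 84 | 5.3%
2.6 | 79 | 4.9%
1.7 | 76 | 4.8%
Other values (81) | 504 | 31.5%

Minimum 5 values

Value | Count | Frequency (%)
0.9 | 2 | 0.1%
1.2 | 8 | 0.5%
1.3 | 5 | 0.3%
1.4 | 35 | 2.2%
1.5 | 30 | 1.9%

Maximum 5 values

Value | Count | Frequency (%)
15.5 | 1 | 0.1%
15.4 | 2 | 0.1%
13.9 | 1 | 0.1%
13.8 | 2 | 0.1%
13.4 | 1 | 0.1%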

chlorides
Real number (ℝ≥0)

Distinct: 153
Distinct (%): 9.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.08746654159
Minimum: 0.012
Maximum: 0.611
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.5 KiB

Quantile statistics

Minimum: 0.012
5th percentile: 0.054
Q1: 0.07
Median: 0.079
Q3: 0.09
95th percentile: 0.1261
Maximum: 0.611
Range: 0.599
Interquartile range (IQR): 0.02

Descriptive statistics

Standard deviation: 0.04706530201
Coefficient of variation (CV): 0.5380949236
Kurtosis: 41.71578725
Mean: 0.08746654159
Median Absolute Deviation (MAD): 0.01
Skewness: 5.680346572
Sum: 139.859
Variance: 0.002215142653
Monotonicity: Not monotonic
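Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0.08 | 66 | 4.1%
0.074 | 55 | 3.4%
0.076 | 51 | 3.2%
0.078 | 51 | 3.2%
0.084 | 49 | 3.1%
0.077 | 47 | 2.9%
0.071 | 47 | 2.9%
0.082 | 46 | 2.9%
0.075 | 45 | 2.8%
0.079 | 43 | 2.7%
Other values (143) | 1099 | 68.7%

Minimum 5 values

Value | Count | Frequency (%)
0.012 | 2 | 0.1%
0.034 | 1 | 0.1%
0.038 | 2 | 0.1%
0.039 | 4 | 0.3%
0.041 | 4 | 0.3%

Maximum 5 values

Value | Count | Frequency (%)
0.611 | 1 | 0.1%
0.61 | 1 | 0.1%
0.467 | 1 | 0.1%
0.464 | 1 | 0.1%
0.422 | 1 | 0.1%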

free sulfur dioxide
Real number (ℝ≥0)

Distinct: 60
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.87492183
Minimum: 1
Maximum: 72
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.5 KiB

Quantile statistics

Minimum: 1
5th percentile: 4
Q1: 7
Median: 14
Q3: 21
95th percentile: 35
Maximum: 72
Range: 71
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 10.46015697
Coefficient of variation (CV): 0.6589107704
Kurtosis: 2.023562046
Mean: 15.87492183
Median Absolute Deviation (MAD): 7
Skewness: 1.250567293
Sum: 25384
Variance: 109.4148838
Monotonicity: Not monotonic
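Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
6 | 138 | 8.6%
5 | 104 | 6.5%
10 | 79 | 4.9%
15 | 78 | 4.9%
12 | 75 | 4.7%
7 | 71 | 4.4%
9 | 62 | 3.9%
16 | 61 | 3.8%
17 | 60 | 3.8%
11 | 59 | 3.7%
Other values (50) | 812 | 50.8%

Minimum 5 values

Value | Count | Frequency (%)
1 | 3 | 0.2%
2 | 1 | 0.1%
3 | 49 | 3.1%
4 | 41 | 2.6%
5 | 104 | 6.5%

Maximum 5 values

Value | Count | Frequency (%)
72 | 1 | 0.1%
68 | 2 | 0.1%
66 | 1 | 0.1%
57 | 1 | 0.1%
55 | 2 | 0.1%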

total sulfur dioxide
Real number (ℝ≥0)

Distinct: 144
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46.46779237
Minimum: 6
Maximum: 289
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.5 KiB

Quantile statistics

Minimum: 6
5th percentile: 11
Q1: 22
Median: 38
Q3: 62
95th percentile: 112.1
Maximum: 289
Range: 283
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 32.89532448
Coefficient of variation (CV): 0.7079166623
Kurtosis: 3.809824488
Mean: 46.46779237
Median Absolute Deviation (MAD): 18
Skewness: 1.515531258
Sum: 74302
Variance: 1082.102373
Monotonicity: Not monotonic
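Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
28 | 43 | 2.7%
24 | 36 | 2.3%
18 | 35 | 2.2%
15 | 35 | 2.2%
23 | 34 | 2.1%
20 | 33 | 2.1%
14 | 33 | 2.1%
31 | 32 | 2.0%
38 | 31 | 1.9%
27 | 30 | 1.9%
Other values (134) | 1257 | 78.6%

Minimum 5 values

Value | Count | Frequency (%)
6 | 3 | 0.2%
7 | 4 | 0.3%
8 | 14 | 0.9%
9 | 14 | 0.9%
10 | 27 | 1.7%

Maximum 5 values

Value | Count | Frequency (%)
289 | 1 | 0.1%
278 | 1 | 0.1%
165 | 1 | 0.1%
160 | 1 | 0.1%
155 | 1 | 0.1%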

density
Real number (ℝ≥0)

Distinct: 436
Distinct (%): 27.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9967466792
Minimum: 0.99007
Maximum: 1.00369
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.5 KiB

Quantile statistics

Minimum: 0.99007
5th percentile: 0.993598
Q1: 0.9956
Median: 0.99675
Q3: 0.997835
95th percentile: 1
Maximum: 1.00369
Range: 0.01362
Interquartile range (IQR): 0.002235

Descriptive statistics

Standard deviation: 0.001887333954
Coefficient of variation (CV): 0.001893494098
Kurtosis: 0.9340790655
Mean: 0.9967466792
Median Absolute Deviation (MAD): 0.00113
Skewness: 0.07128766295
Sum: 1593.79794
Variance: 3.562029453e-06
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0.9972 | 36 | 2.3%
0.9976 | 35 | 2.2%
0.9968 | 35 | 2.2%
0.998 | 29 | 1.8%
0.9962 | 28 | 1.8%
0.9978 | 26 | 1.6%
0.9964 | 25 | 1.6%
0.9994 | 24 | 1.5%
0.997 | 24 | 1.5%
0.9966 | 23 | 1.4%
Other values (426) | 1314 | 82.2%

Minimum 5 values

Value | Count | Frequency (%)
0.99007 | 2 | 0.1%
0.9902 | 1 | 0.1%
0.99064 | 2 | 0.1%
0.9908 | 1 | 0.1%
0.99084 | 1 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1.00369 | 2 | 0.1%
1.0032 | 1 | 0.1%
1.00315 | 3 | 0.2%
1.00289 | 1 | 0.1%
1.0026 | 2 | 0.1%

pH
Real number (ℝ≥0)

Distinct: 89
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.311113196
Minimum: 2.74
Maximum: 4.01
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.5 KiB

Quantile statistics

Minimum: 2.74
5th percentile: 3.06
Q1: 3.21
Median: 3.31
Q3: 3.4
95th percentile: 3.57
Maximum: 4.01
Range: 1.27
Interquartile range (IQR): 0.19

Descriptive statistics

Standard deviation: 0.1543864649
Coefficient of variation (CV): 0.04662675535
Kurtosis: 0.8069425082
Mean: 3.311113196
Median Absolute Deviation (MAD): 0.1
Skewness: 0.1936834981
Sum: 5294.47
Variance: 0.02383518055
Monotonicity: Not monotonic
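Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
3.3 | 57 | 3.6%
3.36 | 56 | 3.5%
3.26 | 53 | 3.3%
3.38 | 48 | 3.0%
3.39 | 48 | 3.0%
3.29 | 46 | 2.9%
3.32 | 45 | 2.8%
3.34 | 43 | 2.7%
3.28 | 42 | 2.6%
3.35 | 39 | 2.4%
Other values (79) | 1122 | 70.2%

Minimum 5 values

Value | Count | Frequency (%)
2.74 | 1 | 0.1%
2.86 | 1 | 0.1%
2.87 | 1 | 0.1%
2.88 | 2 | 0.1%
2.89 | 4 | 0.3%

Maximum 5 values

Value | Count | Frequency (%)
4.01 | 2 | 0.1%
3.9 | 2 | 0.1%
3.85 | 1 | 0.1%
3.78 | 2 | 0.1%
3.75 | 1 | 0.1%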

sulphates
Real number (ℝ≥0)

Distinct: 96
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.658148843
Minimum: 0.33
Maximum: 2
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.5 KiB

Quantile statistics

Minimum: 0.33
5th percentile: 0.47
Q1: 0.55
Median: 0.62
Q3: 0.73
95th percentile: 0.93
Maximum: 2
Range: 1.67
Interquartile range (IQR): 0.18

Descriptive statistics

Standard deviation: 0.1695069796
Coefficient of variation (CV): 0.2575511321
Kurtosis: 11.72025073
Mean: 0.658148843
Median Absolute Deviation (MAD): 0.08
Skewness: 2.428672354
Sum: 1052.38
Variance: 0.02873261613
Monotonicity: Not monotonic
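Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0.6 | 69 | 4.3%
0.58 | 68 | 4.3%
0.54 | 68 | 4.3%
0.62 | 61 | 3.8%
0.56 | 60 | 3.8%
0.57 | 55 | 3.4%
0.59 | 51 | 3.2%
0.53 | 51 | 3.2%
0.55 | 50 | 3.1%
0.63 | 48 | 3.0%
Other values (86) | 1018 | 63.7%

Minimum 5 values

Value | Count | Frequency (%)
0.33 | 1 | 0.1%
0.37 | 2 | 0.1%
0.39 | 6 | 0.4%
0.4 | 4 | 0.3%
0.42 | 5 | 0.3%

Maximum 5 values

Value | Count | Frequency (%)
2 | 1 | 0.1%
1.98 | 1 | 0.1%
1.95 | 2 | 0.1%
1.62 | 1 | 0.1%
1.61 | 1 | 0.1%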

alcohol
Real number (ℝ≥0)

Distinct: 65
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.42298311
Minimum: 8.4
Maximum: 14.9
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.5 KiB

Quantile statistics

Minimum: 8.4
5th percentile: 9.2
Q1: 9.5
Median: 10.2
Q3: 11.1
95th percentile: 12.5
Maximum: 14.9
Range: 6.5
Interquartile range (IQR): 1.6

Descriptive statistics

Standard deviation: 1.065667582
Coefficient of variation (CV): 0.1022420904
Kurtosis: 0.2000293113
Mean: 10.42298311
Median Absolute Deviation (MAD): 0.7
Skewness: 0.8608288069
Sum: 16666.35
Variance: 1.135647395
Monotonicity: Not monotonic
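Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
9.5 | 139 | 8.7%
9.4 | 103 | 6.4%
9.8 | 78 | 4.9%
9.2 | 72 | 4.5%
10.5 | 67 | 4.2%
10 | 67 | 4.2%
11 | 59 | 3.7%
9.3 | 59 | 3.7%
9.6 | 59 | 3.7%
9.7 | 54 | 3.4%
Other values (55) | 842 | 52.7%

Minimum 5 values

Value | Count | Frequency (%)
8.4 | 2 | 0.1%
8.5 | 1 | 0.1%
8.7 | 2 | 0.1%
8.8 | 2 | 0.1%
9 | 30 | 1.9%

Maximum 5 values

Value | Count | Frequency (%)
14.9 | 1 | 0.1%
14 | 7 | 0.4%
13.6 | 4 | 0.3%
13.56666667 | 1 | 0.1%
13.5 | 1 | 0.1%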

quality
Real number (ℝ≥0)

Distinct: 6
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.636022514
Minimum: 3
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Memory size: 12.5 KiB

Quantile statistics

Minimum: 3
5th percentile: 5
Q1: 5
Median: 6
Q3: 6
95th percentile: 7
Maximum: 8
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.8075694397
Coefficient of variation (CV): 0.143287121
Kurtosis: 0.2967081198
Mean: 5.636022514
Median Absolute Deviation (MAD): 1
Skewness: 0.2178015755
Sum: 9012
Variance: 0.6521684
Monotonicity: Not monotonic
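Histogram with fixed size bins (bins=6)

Common values

Value | Count | Frequency (%)
5 | 681 | 42.6%
6 | 638 | 39.9%
7 | 199 | 12.4%
4 | 53 | 3.3%
8 | 18 | 1.1%
3 | 10 | 0.6%

Minimum 5 values

Value | Count | Frequency (%)
3 | 10 | 0.6%
4 | 53 | 3.3%
5 | 681 | 42.6%
6 | 638 | 39.9%
7 | 199 | 12.4%

Maximum 5 values

Value | Count | Frequency (%)
8 | 18 | 1.1%
7 | 199 | 12.4%
6 | 638 | 39.9%
5 | 681 | 42.6%
4 | 53 | 3.3%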
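
Each frequency in these tables is the count divided by the number of observations. A minimal sketch, using the quality counts listed above and a hypothetical helper name, shows how the percentage column can be reproduced:

```python
# Counts per quality score, copied from the quality tables in this report.
quality_counts = {5: 681, 6: 638, 7: 199, 4: 53, 8: 18, 3: 10}


def frequency_table(counts):
    """Return (value, count, percentage) rows sorted by descending count."""
    total = sum(counts.values())
    rows = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(value, count, round(100.0 * count / total, 1)) for value, count in rows]


table = frequency_table(quality_counts)
for value, count, pct in table:
    print(f"{value} | {count} | {pct}%")
```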
 
quality_bool
Boolean

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.5 KiB

Value | Count | Frequency (%)
0 | 1382 | 86.4%
1 | 217 | 13.6%
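
The report does not state how the boolean variable summarised above (quality_bool in the sample tables) was derived. One reading that is consistent with the figures is that it flags wines whose quality is 7 or higher. That threshold is an assumption, not something the report says; the sketch only checks that it reproduces the 1382 / 217 split from the quality counts.

```python
# Counts per quality score, copied from the quality variable in this report.
quality_counts = {3: 10, 4: 53, 5: 681, 6: 638, 7: 199, 8: 18}

THRESHOLD = 7  # assumed cut-off; not stated in the report

flag_counts = {0: 0, 1: 0}
for score, count in quality_counts.items():
    flag_counts[1 if score >= THRESHOLD else 0] += count

total = sum(flag_counts.values())
flag_pct = {flag: round(100.0 * count / total, 1) for flag, count in flag_counts.items()}
print(flag_counts, flag_pct)
```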

Interactions

[144 interaction plots; images not reproduced]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 no linear correlation, and +1 total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line, that is, its angle to the x-axis, does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
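
The calculation can be sketched in a few lines of plain Python. This is an illustration of the formula, not the code pandas-profiling runs; the function name and the sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of (squared) deviations are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical sample values, roughly linear.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(xs, ys)

# Shifting and rescaling either variable by a positive factor leaves r unchanged.
r_shifted = pearson_r([10 * x + 3 for x in xs], [0.5 * y - 7 for y in ys])
```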

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 no monotonic correlation, and +1 total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
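
In other words, ρ is Pearson's r applied to ranks. The sketch below illustrates this under one common convention, which is an assumption here: tied values receive the average of their rank positions. All function names and sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with the entry at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations (normalising factors cancel)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]   # nonlinear but strictly increasing
rho = spearman_rho(xs, ys)  # the ranks agree perfectly
r = pearson_r(xs, ys)       # the raw values are not on a straight line
```

Comparing `rho` with `r` on this data shows why ρ is better suited to monotonic but nonlinear relationships.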

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 no correlation, and +1 total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
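
A direct translation of that description is sketched below. It implements the simple variant in which tied pairs count as neither concordant nor discordant (often called tau-a); library implementations commonly apply a tie correction, so their results can differ when ties are present. The function name and sample values are hypothetical.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs, without tie correction."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Four observations: 6 pairs, of which 5 are concordant and 1 is discordant.
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
```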

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Two missing-values plots; images not reproduced]

Sample

First rows

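index | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality | quality_bool
0 | 7.4 | 0.70 | 0.00 | 1.9 | 0.076 | 11.0 | 34.0 | 0.9978 | 3.51 | 0.56 | 9.4 | 5.0 | 0
1 | 7.8 | 0.88 | 0.00 | 2.6 | 0.098 | 25.0 | 67.0 | 0.9968 | 3.20 | 0.68 | 9.8 | 5.0 | 0
2 | 7.8 | 0.76 | 0.04 | 2.3 | 0.092 | 15.0 | 54.0 | 0.9970 | 3.26 | 0.65 | 9.8 | 5.0 | 0
3 | 11.2 | 0.28 | 0.56 | 1.9 | 0.075 | 17.0 | 60.0 | 0.9980 | 3.16 | 0.58 | 9.8 | 6.0 | 0
4 | 7.4 | 0.70 | 0.00 | 1.9 | 0.076 | 11.0 | 34.0 | 0.9978 | 3.51 | 0.56 | 9.4 | 5.0 | 0
5 | 7.4 | 0.66 | 0.00 | 1.8 | 0.075 | 13.0 | 40.0 | 0.9978 | 3.51 | 0.56 | 9.4 | 5.0 | 0
6 | 7.9 | 0.60 | 0.06 | 1.6 | 0.069 | 15.0 | 59.0 | 0.9964 | 3.30 | 0.46 | 9.4 | 5.0 | 0
7 | 7.3 | 0.65 | 0.00 | 1.2 | 0.065 | 15.0 | 21.0 | 0.9946 | 3.39 | 0.47 | 10.0 | 7.0 | 1
8 | 7.8 | 0.58 | 0.02 | 2.0 | 0.073 | 9.0 | 18.0 | 0.9968 | 3.36 | 0.57 | 9.5 | 7.0 | 1
9 | 7.5 | 0.50 | 0.36 | 6.1 | 0.071 | 17.0 | 102.0 | 0.9978 | 3.35 | 0.80 | 10.5 | 5.0 | 0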

Last rows

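index | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality | quality_bool
1589 | 6.6 | 0.725 | 0.20 | 7.8 | 0.073 | 29.0 | 79.0 | 0.99770 | 3.29 | 0.54 | 9.2 | 5.0 | 0
1590 | 6.3 | 0.550 | 0.15 | 1.8 | 0.077 | 26.0 | 35.0 | 0.99314 | 3.32 | 0.82 | 11.6 | 6.0 | 0
1591 | 5.4 | 0.740 | 0.09 | 1.7 | 0.089 | 16.0 | 26.0 | 0.99402 | 3.67 | 0.56 | 11.6 | 6.0 | 0
1592 | 6.3 | 0.510 | 0.13 | 2.3 | 0.076 | 29.0 | 40.0 | 0.99574 | 3.42 | 0.75 | 11.0 | 6.0 | 0
1593 | 6.8 | 0.620 | 0.08 | 1.9 | 0.068 | 28.0 | 38.0 | 0.99651 | 3.42 | 0.82 | 9.5 | 6.0 | 0
1594 | 6.2 | 0.600 | 0.08 | 2.0 | 0.090 | 32.0 | 44.0 | 0.99490 | 3.45 | 0.58 | 10.5 | 5.0 | 0
1595 | 5.9 | 0.550 | 0.10 | 2.2 | 0.062 | 39.0 | 51.0 | 0.99512 | 3.52 | 0.76 | 11.2 | 6.0 | 0
1596 | 6.3 | 0.510 | 0.13 | 2.3 | 0.076 | 29.0 | 40.0 | 0.99574 | 3.42 | 0.75 | 11.0 | 6.0 | 0
1597 | 5.9 | 0.645 | 0.12 | 2.0 | 0.075 | 32.0 | 44.0 | 0.99547 | 3.57 | 0.71 | 10.2 | 5.0 | 0
1598 | 6.0 | 0.310 | 0.47 | 3.6 | 0.067 | 18.0 | 42.0 | 0.99549 | 3.39 | 0.66 | 11.0 | 6.0 | 0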

Duplicate rows

Most frequent

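index | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality | quality_bool | count
22 | 6.7 | 0.460 | 0.24 | 1.7 | 0.077 | 18.0 | 34.0 | 0.99480 | 3.39 | 0.60 | 10.6 | 6.0 | 0 | 4
52 | 7.2 | 0.360 | 0.46 | 2.1 | 0.074 | 24.0 | 44.0 | 0.99534 | 3.40 | 0.85 | 11.0 | 7.0 | 1 | 4
63 | 7.2 | 0.695 | 0.13 | 2.0 | 0.076 | 12.0 | 20.0 | 0.99546 | 3.29 | 0.54 | 10.1 | 5.0 | 0 | 4
81 | 7.5 | 0.510 | 0.02 | 1.7 | 0.084 | 13.0 | 31.0 | 0.99538 | 3.36 | 0.54 | 10.5 | 6.0 | 0 | 4
5 | 6.0 | 0.500 | 0.00 | 1.4 | 0.057 | 15.0 | 26.0 | 0.99448 | 3.36 | 0.45 | 9.5 | 5.0 | 0 | 3
12 | 6.4 | 0.640 | 0.21 | 1.8 | 0.081 | 14.0 | 31.0 | 0.99689 | 3.59 | 0.66 | 9.8 | 5.0 | 0 | 3
39 | 7.0 | 0.650 | 0.02 | 2.1 | 0.066 | 8.0 | 25.0 | 0.99720 | 3.47 | 0.67 | 9.5 | 6.0 | 0 | 3
40 | 7.0 | 0.690 | 0.07 | 2.5 | 0.091 | 15.0 | 21.0 | 0.99572 | 3.38 | 0.60 | 11.3 | 6.0 | 0 | 3
60 | 7.2 | 0.630 | 0.00 | 1.9 | 0.097 | 14.0 | 38.0 | 0.99675 | 3.37 | 0.58 | 9.0 | 6.0 | 0 | 3
104 | 7.8 | 0.600 | 0.26 | 2.0 | 0.080 | 31.0 | 131.0 | 0.99622 | 3.21 | 0.52 | 9.9 | 5.0 | 0 | 3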